Dynamics of quasi-stationary systems: Finance as an example
We propose a combination of cluster analysis and stochastic process analysis
to characterize high-dimensional complex dynamical systems by a few dominating
variables. As an example, stock market data are analyzed, for which the
dynamical stability as well as transitions between different stable states are
found. This combined method also allows one to set up new criteria for merging
clusters to reduce the complexity of the system. The low-dimensional approach
allows one to recover the high-dimensional fixed points of the system by means
of an optimization procedure.
I. INTRODUCTION


For complex dynamical systems consisting of many interacting subsystems, it is
a general challenge to reduce the high dimensionality to a few dominating
variables that characterize the system. Cluster analysis is a method to group
elements according to their similarity. However, there is an ambiguity, as
different algorithms may lead to different clusters. The dynamical behavior of
complex systems may reduce their complexity by self-organization. Here the
high-dimensional dynamics generates a few order parameters evolving slowly on
a strongly fluctuating background [5]. With the help of stochastic methods it
is possible to show such simplified dynamics and to estimate a Langevin
equation directly from the data, i.e. the data are analyzed as a stochastic
process with drift and diffusion terms [3].

We show that cluster analysis and stochastic methods can be combined in a
fruitful way. Cluster analysis by itself does not aim at grasping dynamical
effects. By combining it with stochastic methods we show that an improved
dynamical cluster classification can be obtained. Furthermore, we extract
dynamical cluster features of a complex system, such as emerging and
disappearing clusters. Related ideas were put forward by Hutt et al. for
detecting fixed points in spatiotemporal signals. Here, we focus on financial
data, for which quasi-stationary market states were identified in the
correlation structure using cluster analysis [13]. We briefly sketch these
findings before developing our combined analysis.


II. MARKET STATES: CLUSTERING

We study the daily closing prices $S_k(t)$ of the S&P 500 stocks for the
period 1992-2012, which amounts to 5189 trading days. Only the 306 stocks that
were continuously traded during the whole period are considered. We calculate
the relative price changes, i.e. the returns

$$r_k(t) = \frac{S_k(t+\Delta t) - S_k(t)}{S_k(t)} \tag{1}$$

for each stock k for a time horizon $\Delta t$ of one trading day (td). To
avoid an estimation bias due to time-varying trends and fluctuations, we
perform a local normalization of the returns before estimating their
correlations. We denote the locally normalized returns by $f_k(t)$. We then
calculate the Pearson correlation coefficients


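$$C_{ij} = \frac{\langle f_i f_j \rangle - \langle f_i \rangle \langle f_j \rangle}{\sigma_i \sigma_j} \tag{2}$$

Here $\langle \ldots \rangle$ denotes the average over 42 trading days, and
$\sigma_k$ is the standard deviation of $f_k$ over the same window. We obtain
a correlation matrix C(t) for each time t by moving the calculation window in
steps of one trading day through the time series.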
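
The moving-window estimate can be sketched as follows. This is a minimal
illustration, not the original implementation: it assumes the usual definition
of the return as a relative price change, skips the local normalization step
(whose details are not given here), and uses synthetic prices. The function
names `returns` and `rolling_correlations` are hypothetical.

```python
import numpy as np

def returns(prices, dt=1):
    """Relative price changes (S(t+dt) - S(t)) / S(t), one column per stock."""
    prices = np.asarray(prices, dtype=float)
    return (prices[dt:] - prices[:-dt]) / prices[:-dt]

def rolling_correlations(f, window=42):
    """Pearson correlation matrix of the columns of f for each window position.

    Returns an array of shape (T - window + 1, K, K).
    """
    f = np.asarray(f, dtype=float)
    n_windows = f.shape[0] - window + 1
    # np.corrcoef expects one variable per row, hence the transpose.
    return np.stack([np.corrcoef(f[s:s + window].T) for s in range(n_windows)])

# Synthetic stand-in for price data (assumption: geometric random walks).
rng = np.random.default_rng(0)
prices = 100.0 * np.exp(np.cumsum(0.01 * rng.standard_normal((120, 5)), axis=0))
C = rolling_correlations(returns(prices), window=42)
print(C.shape)
```

Each slice `C[s]` is the correlation matrix of one 42-day window; the window
advances by one trading day per slice.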

All of the considered companies are classified into ten industry sectors
according to the Global Industry Classification Standard (GICS). To reduce the
estimation noise, we average the correlation coefficients within each sector
and between different sectors, leading to $10 \times 10$ matrices. While this
sector-averaged matrix is symmetric, its diagonal is not trivial. Thus, it
contains d = (10^2 + 10)/2 = 55 independent entries. The crucial idea is to
identify each averaged correlation matrix with an element of the real
d-dimensional Euclidean space $\mathbb{R}^d$, so that a similarity, i.e. a
distance, between any two correlation matrices can be calculated. Throughout
this paper all distances are measured in the Euclidean norm, which we denote
by || ... ||.
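
A possible sketch of the sector averaging and of the mapping to a
55-dimensional vector is given below. It rests on assumptions not stated in
the text: self-correlations are excluded from the within-sector averages, and
every sector contains at least two stocks. The names `sector_average` and
`to_vector` are hypothetical, and the input matrix is synthetic.

```python
import numpy as np

def sector_average(C, sectors, n_sectors):
    """Average a K x K correlation matrix within and between sectors.

    sectors[k] is the integer sector label (0 .. n_sectors - 1) of stock k.
    Assumption: the trivial self-correlations C_kk = 1 are left out.
    """
    C = np.asarray(C, dtype=float)
    sectors = np.asarray(sectors)
    A = np.zeros((n_sectors, n_sectors))
    off_diagonal = ~np.eye(C.shape[0], dtype=bool)
    for a in range(n_sectors):
        for b in range(n_sectors):
            block = np.outer(sectors == a, sectors == b) & off_diagonal
            A[a, b] = C[block].mean()
    return A

def to_vector(A):
    """Flatten the upper triangle, diagonal included, of a symmetric matrix."""
    rows, cols = np.triu_indices(A.shape[0])
    return A[rows, cols]

# Synthetic example: 20 stocks, two per sector, ten sectors.
rng = np.random.default_rng(0)
C = np.corrcoef(rng.standard_normal((20, 200)))
sectors = np.repeat(np.arange(10), 2)
A = sector_average(C, sectors, n_sectors=10)
v = to_vector(A)
print(v.shape)
```

The Euclidean distance between two such vectors counts every off-diagonal
sector pair once; whether the original analysis weights them this way is not
specified.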

As the next step we use the bisecting k-means clustering algorithm. At the
beginning of the clustering procedure all of the correlation matrices are
considered as one cluster, which is then divided into two subclusters using
the k-means algorithm with k = 2. This separation procedure is repeated until
the size of each cluster, measured as the mean distance of the cluster members
to the cluster center, is smaller than a given threshold. We choose a
threshold of 1.564 for this mean distance to obtain 8 clusters, as in
Ref. [13]. The market is said to be in market state i at time t if the
corresponding correlation matrix is in the i-th cluster. Figure 1(a) shows the
corresponding clustering tree. Figure 1(b) depicts the temporal evolution of
the financial market. New clusters form and existing clusters vanish in the
course of time. While the first 1000 trading days are dominated by state 1
with only occasional jumps to state 3, we observe more frequent jumps and less
stable behavior in more recent times.
FIG. 1: Clustering tree (a) and temporal evolution of the states (b).
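
The bisecting procedure with a threshold on the mean distance can be sketched
as below. This is a simplified illustration under stated assumptions: the
2-means step uses plain Lloyd iterations with random initial centers, and the
data are two synthetic, well-separated groups of points. The helper names
`two_means`, `mean_radius` and `bisecting_kmeans` are hypothetical and do not
refer to the implementation used for the results above.

```python
import numpy as np

def two_means(X, rng, n_iter=50):
    """Plain k-means with k = 2; returns a boolean mask for the second cluster."""
    centers = X[rng.choice(len(X), size=2, replace=False)]
    for _ in range(n_iter):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        if labels.min() == labels.max():      # degenerate split, give up
            break
        centers = np.array([X[labels == j].mean(axis=0) for j in (0, 1)])
    return labels == 1

def mean_radius(X):
    """Mean Euclidean distance of the members to the cluster center."""
    return np.linalg.norm(X - X.mean(axis=0), axis=1).mean()

def bisecting_kmeans(X, threshold, seed=0):
    """Split clusters until every mean radius is below `threshold`."""
    rng = np.random.default_rng(seed)
    labels = np.zeros(len(X), dtype=int)
    todo = [0]
    while todo:
        c = todo.pop()
        idx = np.flatnonzero(labels == c)
        if len(idx) < 2 or mean_radius(X[idx]) < threshold:
            continue
        mask = two_means(X[idx], rng)
        if mask.all() or not mask.any():      # split failed, keep the cluster
            continue
        new = labels.max() + 1
        labels[idx[mask]] = new
        todo += [c, new]
    return labels

# Two tight, well-separated groups of points as a toy data set.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.1, (50, 2)), rng.normal(5, 0.1, (50, 2))])
labels = bisecting_kmeans(X, threshold=1.0)
```

With a larger threshold the loop stops earlier and returns fewer clusters,
which is the mechanism that ties the number of market states to the chosen
value.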



To quantify the market situation at time t, the distance

$$X_i(t) = \lVert C(t) - \mu_i \rVert \tag{3}$$

of the correlation matrix C(t) to the eight cluster centers $\mu_i$
(i = 1, 2, ..., 8) is calculated. Two of the eight resulting time series are
shown in Fig. 2. These time series depict the temporal evolution of the system
as seen from the respective cluster centers. Figure 2 shows that the market is
initially close to cluster center 1 and far away from cluster center 6. This
changes over time in accordance with Fig. 1(b): cluster 6 occurs later, while
cluster 1 is present during the first half of the time period.
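
As a small illustration, the distance time series to all cluster centers can
be computed in one step. The function name `distances_to_centers` and the toy
vectors are hypothetical; in the setting above each vector would hold the 55
independent entries of a sector-averaged correlation matrix.

```python
import numpy as np

def distances_to_centers(vectors, centers):
    """Euclidean distance of every vector to every center.

    vectors: shape (T, d), one row per time t.
    centers: shape (n, d), one row per cluster center.
    Returns an array of shape (T, n) whose column i is the series X_i(t).
    """
    vectors = np.asarray(vectors, dtype=float)
    centers = np.asarray(centers, dtype=float)
    return np.linalg.norm(vectors[:, None, :] - centers[None, :, :], axis=2)

X = distances_to_centers([[0.0, 0.0], [3.0, 4.0]], [[0.0, 0.0], [3.0, 4.0]])
print(X)
```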


III. STOCHASTIC ANALYSIS 


A wide class of dynamical systems from different fields is modeled as
stochastic processes and thus described by a stochastic differential equation,
the Langevin equation

$$\dot{X}(t) = D^{(1)}(X,t) + \sqrt{D^{(2)}(X,t)}\,\Gamma(t), \tag{4}$$

which describes the time evolution of the system variable X as a sum of a
deterministic function $D^{(1)}(X,t)$ and a stochastic term
$\sqrt{D^{(2)}(X,t)}\,\Gamma(t)$. The stochastic force $\Gamma(t)$ is Gaussian
distributed with $\langle \Gamma(t) \rangle = 0$ and
$\langle \Gamma(t)\Gamma(t') \rangle = 2\delta(t - t')$. For stationary
continuous Markov processes Siegert et al. and Friedrich et al. showed that it
is possible to extract drift and diffusion functions directly from measured
time series using the Kramers-Moyal coefficients $D^{(n)}(x)$, which are
defined via the conditional moments $M^{(n)}(x,\tau)$,

$$D^{(n)}(x) = \lim_{\tau \to 0} \frac{1}{n!\,\tau} M^{(n)}(x,\tau), \tag{5}$$

$$M^{(n)}(x,\tau) = \left\langle \left[ X(t+\tau) - X(t) \right]^n \right\rangle \Big|_{X(t)=x}. \tag{6}$$

Here x denotes the value of the stochastic variable X(t) at which the drift
function (n = 1) and the diffusion function (n = 2) are evaluated. The average
in Eq. (6) is performed over all realizations of X(t) for which the condition
X(t) = x holds. Equation (6) therefore expresses (modulo the exponent n) the
mean increment of the variable X(t) after a time step $\tau$ when starting at
the given value x at time t. The derivative of this mean increment with
respect to $\tau$ at $\tau = 0$ is equal, up to the factor 1/n!, to the value
of the drift function for n = 1 and of the diffusion function for n = 2 at x,
as defined in Eq. (5). We refer to Ref. [3] for further details.

FIG. 2: (Color online) Evolution of the distance (3) of the system to cluster
center 1 (dashed, red) and cluster center 6 (black).

In the present work we estimate the drift functions on time windows of 1000
trading days, on which the estimated quantities are treated as time
independent. The time dependence is obtained by comparing the drift functions
estimated on a sliding time window. Especially for small data sets, the
estimation of the conditional moments (6) can be difficult. In addition to the
original estimation procedure proposed by Siegert et al. and Friedrich et
al. [5], we use here a kernel-based regression as proposed in Ref. [10].
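
One way such a kernel-based drift estimate could look is sketched below. It is
an assumption-laden illustration, not the cited procedure: it uses a
Nadaraya-Watson regression of the one-step increments with a Gaussian kernel,
a hand-picked bandwidth, and the smallest available time step in place of the
limit of vanishing step. The test signal is a simulated Ornstein-Uhlenbeck
process with drift -x, and the name `drift_kernel` is hypothetical.

```python
import numpy as np

def drift_kernel(x_series, grid, bandwidth, tau=1.0):
    """Kernel-weighted mean increment per unit time at each point of `grid`.

    Approximates the first conditional moment divided by tau, with the
    condition X(t) = x softened by a Gaussian kernel of width `bandwidth`.
    """
    x = np.asarray(x_series, dtype=float)
    grid = np.asarray(grid, dtype=float)
    increments = x[1:] - x[:-1]
    # weights[g, t]: kernel weight of sample t for grid point g
    weights = np.exp(-0.5 * ((grid[:, None] - x[None, :-1]) / bandwidth) ** 2)
    return (weights @ increments) / (weights.sum(axis=1) * tau)

# Simulated test data: Euler scheme for dX/dt = -X + noise.
rng = np.random.default_rng(0)
tau, n = 0.01, 200_000
noise = np.sqrt(2 * 0.5 * tau) * rng.standard_normal(n - 1)
x = np.zeros(n)
for t in range(n - 1):
    x[t + 1] = x[t] - x[t] * tau + noise[t]

grid = np.array([-1.0, 0.0, 1.0])
d1 = drift_kernel(x, grid, bandwidth=0.2, tau=tau)
print(d1)
```

The estimate should be positive left of the fixed point at x = 0 and negative
right of it. The bandwidth trades bias against variance, which matters most
for short windows.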

Instead of analyzing the drift function itself, it is more convenient to
consider the potential function defined as the negative primitive integral

$$\Phi(x) = -\int^{x} D^{(1)}(x')\,\mathrm{d}x' \tag{7}$$

of the drift function. The minus sign is a convention. The dynamics of the
system is encoded in the shape of $\Phi(x)$: the local minima of the potential
function correspond to quasi-stable equilibria, or quasi-stable fixed points,
around which the system oscillates. In contrast, local maxima correspond to
unstable fixed points. We note that, by definition, potential functions are
defined only up to an additive constant, which is here set to zero. We further
note that both the drift function and the potential function have the
dimension of inverse time.
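
Given a drift estimate on a grid, the potential and its minima can be obtained
numerically, for instance as sketched here. The trapezoidal rule, the choice
of fixing the additive constant at the left edge of the grid, and the names
`potential` and `local_minima` are assumptions made for illustration; the
drift in the example is a synthetic linear one with a fixed point at 1.5.

```python
import numpy as np

def potential(grid, drift):
    """Negative cumulative integral of the drift (trapezoidal rule).

    The additive constant is fixed by setting the potential to zero at grid[0].
    """
    grid = np.asarray(grid, dtype=float)
    drift = np.asarray(drift, dtype=float)
    steps = 0.5 * (drift[1:] + drift[:-1]) * np.diff(grid)
    return -np.concatenate(([0.0], np.cumsum(steps)))

def local_minima(grid, phi):
    """Grid positions where phi is strictly below both neighbours."""
    grid = np.asarray(grid, dtype=float)
    phi = np.asarray(phi, dtype=float)
    inner = (phi[1:-1] < phi[:-2]) & (phi[1:-1] < phi[2:])
    return grid[1:-1][inner]

grid = np.linspace(0.0, 3.0, 301)
drift = -(grid - 1.5)            # linear restoring drift towards x = 1.5
phi = potential(grid, drift)
minima = local_minima(grid, phi)
print(minima)
```

For noisy empirical drifts a smoothing step before the minimum search would
probably be needed to avoid spurious minima.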


IV. OPTIMIZING THE ALGORITHM 

One drawback of clustering is that the number of clusters is in general not
known a priori. Therefore a threshold criterion is used. Furthermore,
clustering is based only on geometrical properties like positions and
distances between elements, i.e. their similarity. The dynamics of the system
is not involved. As a new approach we combine the cluster analysis with the
stochastic analysis. The aim is to extract dynamical attributes from the time
series and to derive stability features and quasi-stable fixed points.

From the eight time series defined in Eq. (3) we calculate the deterministic
potentials $\Phi_i(x)$ (i = 1, 2, ..., 8) as defined in Eq. (7). To grasp the
time evolution of the clusters, i.e. market states, we calculate $\Phi_i(x)$
on a window of 1000 data points (≈ four trading years), which we shift by 21
data points (≈ one trading month), resulting in 199 deterministic potentials
$\Phi_i(x)$ for each cluster i.
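
The window bookkeeping can be written as a small generator. This is only a
sketch: the name `sliding_windows` is hypothetical, only complete windows are
kept, and the series length in the example is an arbitrary small value rather
than the length of the actual data.

```python
def sliding_windows(n_points, width=1000, shift=21):
    """Yield (start, stop) index pairs of all complete windows."""
    for start in range(0, n_points - width + 1, shift):
        yield start, start + width

windows = list(sliding_windows(1042))
print(windows)
```

On each window one would estimate the drift of every distance series and
integrate it to a potential, so that the number of potentials per cluster
equals the number of windows.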

Figure 3 shows a sample of these potentials for different time windows. The
dotted vertical lines denote the distances to the cluster centers, which are
labeled on the abscissa. Each potential in the five panels shows a clear
minimum. Hence, the market, as expressed by the correlation matrices, performs
noisy dynamics around the attractive fixed point defined by the minimum. Most
interestingly, the positions of the minima of these potentials coincide quite
well with the distances to the cluster centers obtained from the cluster
analysis. In addition to the positions of the fixed points of the system, the
potentials provide information about the stability of the market in the
analyzed time window. Potential functions with more than one clear local
minimum, e.g. Fig. 3(a) and (b), reflect unstable dynamics. In contrast, an
isolated and deep minimum, as in Fig. 3(e), corresponds to stable dynamics.

In Fig. 3(c)-(e) a transition from states 4 and 5, which are very close, to
state 2 is shown. The quasi-stable fixed point of the potential function
changes its position in time and moves closer to the center of the first
cluster. We note that in the intermediate state, shown in Fig. 3(d), the width
of the potential function is increased. Instead of a clear local minimum it
has a rather flat plateau. The time evolution of the market, reflected in the
change of position of the local minimum of the potential function, is thus
taken as a state transition. The ability of our approach to describe
multi-stable and transitional behavior has also been shown by Mücke et al. in
the context of wind energy. Stability of market states as well as state
transitions are studied in detail by Stepanov et al.


V. MERGING OF THE CLUSTERS 

Besides the cases where the positions of the minima of the potentials coincide
clearly with the distances to cluster centers, there are also less clear
situations, as shown in Fig. 4. In the considered time window
I = [3109 : 4108] (expressed in td) the market switches between the states 2,
3, 4, 5 and 6. The clustering procedure described above does not take the
dynamics of the market into account. All of the correlation matrices are
clustered at the same time, without any information about how the market jumps
between the correlation matrices. The elements of a cluster are chosen in such
a way that the sum of the squared distances to its cluster center is
minimized.

The stochastic analysis of the market dynamics seen from cluster center 1, as
well as from cluster center 3, shows that the potential function exhibits a
pronounced minimum between the distances to the cluster centers 5, 4 and 6, as
shown in Fig. 4. This fact is now taken as a criterion to change the cluster
analysis such that the clusters 4, 5 and 6 are combined into a new cluster 5*.
That this merging of clusters is natural is indicated by the clustering tree
of Figure 1(a), as the three merged clusters arise from the split of the same
ancestor cluster. Had the threshold been chosen larger, this ancestor cluster
would not have been split.
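
In terms of bookkeeping, merging amounts to relabeling the members and
recomputing the center. The sketch below assumes that the center of the merged
cluster is the mean of all its members, as for k-means centers. The function
name `merge_clusters` is hypothetical, the label 5 stands in for 5*, and the
data are toy values.

```python
import numpy as np

def merge_clusters(labels, vectors, to_merge, new_label):
    """Relabel the clusters in `to_merge` and return the new labels together
    with the center (member mean) of the merged cluster."""
    labels = np.asarray(labels).copy()
    labels[np.isin(labels, to_merge)] = new_label
    center = np.asarray(vectors, dtype=float)[labels == new_label].mean(axis=0)
    return labels, center

labels = [1, 4, 5, 6, 4]
vectors = [[0.0], [1.0], [2.0], [3.0], [4.0]]
merged, center = merge_clusters(labels, vectors, to_merge=[4, 5, 6], new_label=5)
print(merged, center)
```

The distance from another cluster center to this merged center is the quantity
that can then be compared with the position of the potential minimum.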

For reference we show in Fig. 5 the temporal evolution of the states after
merging clusters 4, 5 and 6 into cluster 5*. The distance to the center of the
merged cluster as seen from clusters 1 and 3, respectively, coincides
remarkably well with the position of the local minimum of the potential, as
marked by the dash-dotted line in Fig. 4. We thus conclude that while for the
chosen threshold the clusters, i.e. market states 4, 5 and 6, are
geometrically distinct, together these states form a single quasi-stationary
state. While being in this state, the system fluctuates around the center of
the cluster 5*, which is a fixed point of the system. We note that the mean
correlation matrix

$$\bar{C} = \frac{1}{N_I} \sum_{t \in I} C(t) \tag{8}$$

of the analyzed time period I differs clearly from the center of the merged
cluster $\mu_{5*}$, with $\lVert \mu_{5*} - \bar{C} \rVert = 0.55$. Here $N_I$
denotes the length of I. Furthermore, the distances to cluster center 1,
$\lVert \mu_1 - \bar{C} \rVert = 1.42$, and to cluster center 3,
$\lVert \mu_3 - \bar{C} \rVert = 0.82$, do not match the positions of the
minimum of the potential shown in Fig. 4.

FIG. 3: Potentials exhibit stability of fixed points (a), (b) and transitions
between system states (c)-(e). The vertical dotted lines correspond to the
respective cluster centers, see Fig. 2. The interval in days over which the
potentials were obtained is denoted by $[t_1, t_2]$.


VI. IDENTIFYING HIGH-DIMENSIONAL 
FIXED POINTS 

In the previous section we used clustering and stochastic analysis of the
one-dimensional time series to obtain fixed points of the system. We
identified the positions of the local minima of the potential functions with
the distances to the quasi-stationary fixed points. In this section we propose
a method to identify quasi-stationary fixed points by solving an optimization
problem.


We recall that the mean correlation matrix (8) minimizes the sum of squared
distances

$$\sum_{t \in I} \lVert A - C(t) \rVert^2 \tag{9}$$

of C(t) in I to a fixed correlation matrix A. For high-dimensional empirical
data, distinct subspaces (principal components) may exist along which data
points are preferably distributed. Therefore we require the fixed points of
the system to be elements of these subspaces. As the sum of two elements of a
given principal component is not necessarily an element of this component
anymore, the calculation of the average (8) does not necessarily yield the
fixed point of the system. We gave an example of this in the previous section:
the calculated average $\bar{C}$ does not match the position of the fixed
point found by stochastic analysis.

Here we propose to identify system fixed points as the elements of the
preferred subspaces which are the most similar to all C(t) in a given time
interval I, i.e. they minimize the sum (9). In the absence of any preferred
subspaces we recover the mean correlation matrix $\bar{C}$. This idea is
similar to the definition of geodesic curves on an arbitrary manifold, which
minimize the distance between any two points and are not necessarily straight
lines.

FIG. 4: (Color online) Result of merging states 4, 5 and 6 to match the
minimum of the potential seen from state 1 (a) and state 3 (b).

FIG. 5: Temporal evolution of the remaining six states (top). The merged
market state is denoted by 5*. The lower plot shows the time evolution of
$\Delta(t)$ as defined in Eq. (11). The time points at which the market
occupies the merged state 5* are marked by the vertical grey lines.

As an application we consider the data sample from the previous section. As
shown in Fig. 4, the stochastic analysis in the time interval
I = [3109 : 4108] (expressed in trading days) shows a deep minimum at
$X_0 \approx 1.9$ as seen from the center of the first market state $\mu_1$.
The data points are therefore distributed approximately around $\mu_1$. The
set of all possible fixed points is restricted to the points on the
(d - 1)-dimensional hypersphere of radius $X_0$ around $\mu_1$. To find the
empirical fixed point for this setting we minimize (9) under the condition

$$\lVert \mu_1 - A \rVert = X_0 \approx 1.9, \tag{10}$$

which we solve by the method of Lagrange multipliers. We note that for the
Euclidean norm the problem always has a unique minimum and maximum, unless the
distance (3) is constant.
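
For illustration, the constrained problem has a closed-form solution if one
assumes that the objective is the sum of squared Euclidean distances. That sum
equals the number of points times the squared distance to the mean plus a
constant, so the minimizer is the point of the hypersphere closest to the
mean. The sketch below uses this assumption with synthetic data; the name
`fixed_point_on_sphere` is hypothetical, and this is not necessarily how the
empirical fixed point above was computed.

```python
import numpy as np

def fixed_point_on_sphere(vectors, reference, radius):
    """Minimize the sum of squared distances to `vectors` over all points at
    distance `radius` from `reference`.

    The minimizer lies on the ray from the reference point through the mean
    of the vectors; the maximizer is the opposite point of the sphere.
    """
    mean = np.asarray(vectors, dtype=float).mean(axis=0)
    direction = mean - reference
    norm = np.linalg.norm(direction)
    if norm == 0.0:
        raise ValueError("objective is constant on the sphere; no unique minimum")
    return reference + radius * direction / norm

def objective(A, vectors):
    """Sum of squared Euclidean distances from A to all vectors."""
    return np.sum(np.linalg.norm(vectors - A, axis=1) ** 2)

# Synthetic data: a cloud of points away from the reference point.
rng = np.random.default_rng(0)
vectors = rng.normal(size=(200, 5)) + 2.0
reference = np.zeros(5)
A0 = fixed_point_on_sphere(vectors, reference, radius=1.9)
print(np.linalg.norm(A0 - reference), objective(A0, vectors))
```

The degenerate case in which the mean coincides with the reference point is
the one where the objective is constant on the sphere.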

The empirical solution of the problem, denoted by $C_0$, differs slightly from
the center of the merged cluster defined in the previous section,
$\lVert C_0 - \mu_{5*} \rVert \approx 0.35$. This is because the market does
not stay exclusively in the merged state 5* during the analyzed time period I,
as observed in Fig. 5(a).

We now quantify the deviation of the obtained fixed point $C_0$ from the
averaged correlation matrix $\bar{C}$ by looking at the difference

$$\Delta(t) = \lVert C(t) - C_0 \rVert - \lVert C(t) - \bar{C} \rVert \tag{11}$$

of the distances from C(t) to $C_0$ and $\bar{C}$, respectively. The market is
closer to $C_0$ than to $\bar{C}$ whenever $\Delta(t) < 0$ holds.
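
The difference of distances in Eq. (11) is straightforward to evaluate for a
whole time series at once. The function name `distance_difference` and the toy
vectors are hypothetical.

```python
import numpy as np

def distance_difference(vectors, c0, c_mean):
    """Distance to c0 minus distance to c_mean for every row of `vectors`.

    Negative values mean that the row is closer to c0 than to c_mean.
    """
    vectors = np.asarray(vectors, dtype=float)
    return (np.linalg.norm(vectors - np.asarray(c0, dtype=float), axis=1)
            - np.linalg.norm(vectors - np.asarray(c_mean, dtype=float), axis=1))

diff = distance_difference([[0.0, 0.0], [4.0, 0.0]], [0.0, 0.0], [4.0, 0.0])
print(diff)
```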

The time series $\Delta(t)$ is shown in Fig. 5(b). It switches between
positive and negative values. The grey background highlights the time points t
at which the market occupies the merged state 5*, which coincide remarkably
well with the negative values of $\Delta(t)$. The periods in which the market
occupies the merged state are identified almost exactly as those during which
$\Delta(t)$ falls rapidly from positive values. Not only is the market state
identified, but also the dynamics of the market within the state. As seen from
Fig. 5(b), the values of $\Delta(t)$ continuously decrease and then increase
again while the market occupies the merged state. In contrast, $\Delta(t)$
fluctuates around a positive value while the market is not in the merged
state.

As a consistency check, we applied the same algorithm with the center of the
third cluster $\mu_3$, as well as the overall mean correlation matrix, as
reference points. In all cases we obtained the same fixed point $C_0$.

VII. CONCLUSIONS 

The combination of cluster analysis and stochastic process analysis allows one
to characterize the dynamics of a dynamical system as a noisy process between
different quasi-stationary states. For financial markets these market states
are defined in terms of correlation matrices, which reflect the dependence
structure of the stock market. In particular, while being in a given market
state, the market fluctuates around the center of the corresponding cluster. A
threshold criterion can produce geometrically distinct states which together
form a single quasi-stationary state of the system. The deviation of the
market situation at time t from the individual market states is reflected in
the distance between the correlation matrix C(t) and the respective cluster
centers. This distance is taken as the new low-dimensional order parameter of
the complex system. The stochastic analysis provides evidence of how the
market dynamics is guided by a changing potential landscape with temporally
changing stability of the market states. Emerging new quasi-stable states are
found. In this way we present a method with which the dynamics of a
high-dimensional complex system is projected onto low-dimensional collective
dynamics. Furthermore, we address the high dimensionality of the data set by
an optimization problem. This problem is well defined and is robust against
the choice of reference points. We obtain high-dimensional quasi-stable fixed
points of financial markets explicitly and see good chances to apply this
method to other complex systems as well.